Revert naive compression format #32

Merged on Jul 22, 2024 (1 commit)

@Satrat (Contributor) commented on Jul 22, 2024

SUMMARY:
Rather than having float and int quantization share a single format, we now infer int-quantized for uniform integer quantization and float-quantized for uniform fp8 quantization. Any non-uniform quantization still defaults to naive-quantized.
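
For readers who want the rule spelled out, here is a minimal sketch of the inference described above. It is not the code from this PR: `WeightArgs` and `infer_format` are hypothetical names, and "uniform" is assumed to mean that every quantized layer uses the same weight type and bit width. Only the three format names come from the summary.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class WeightArgs:
    """Hypothetical stand-in for one layer's weight quantization settings."""
    type: str      # "int" or "float"
    num_bits: int


def infer_format(layers: Iterable[WeightArgs]) -> Optional[str]:
    """Pick a format name from per-layer weight settings (illustrative only)."""
    unique = set(layers)
    if not unique:
        # Assumption: with no quantized layers there is no quantized format.
        return None
    if len(unique) == 1:
        # Uniform case: every layer shares the same settings.
        args = next(iter(unique))
        if args.type == "int":
            return "int-quantized"
        if args.type == "float" and args.num_bits == 8:
            return "float-quantized"
    # Mixed settings, or a uniform scheme not covered above, fall back.
    return "naive-quantized"
```

Under these assumptions, a model whose layers are all 8-bit int maps to int-quantized, an all-fp8 model maps to float-quantized, and a model mixing 4-bit and 8-bit int layers falls back to naive-quantized.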

TEST PLAN:
Updated the unit test with the new expected defaults.

@Satrat changed the title from "Revert naive-compression format" to "Revert naive compression format" on Jul 22, 2024
@robertgshaw2-redhat merged commit 07c1fd7 into main on Jul 22, 2024
8 of 12 checks passed
markmc pushed a commit to markmc/llm-compressor that referenced this pull request Nov 13, 2024
* group size

* add logic in base observer

* group size full lifecycle run

* before vectorize the for loop

* comments, todo add channelwise

* chan wise impl

* comments

* fix channel wise

* comments, validators

* fix typo

* tensor return error fix

* fix sparseml-side of code and add per channel

* pydantic defaults

* token wise quant

* Update src/compressed_tensors/quantization/quant_args.py

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* comments

* update dim

* shape consistency

* Update src/compressed_tensors/quantization/lifecycle/forward.py

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* comments

* pass test_quant_args

---------

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
3 participants